Entity resolution for probabilistic data
Similar resources
Entity resolution for probabilistic data
Entity resolution is the problem of identifying the tuples that represent the same real world entity. In this paper, we address the problem of entity resolution over probabilistic data (ERPD), which arises in many applications that have to deal with probabilistic data. To deal with the ERPD problem, we distinguish between two classes of similarity functions, i.e. context-free and context-sensit...
On Entity Resolution for Probabilistic Data
Entity resolution (ER) is the problem of identifying duplicate tuples, which are the tuples that represent the same real-world entity. There are many real-life applications in which the ER problem arises. These applications range from news aggregation websites, identifying the news that cover the same story, in order to avoid presenting one story several times to the user, to the integration of...
Entity Resolution for Uncertain Data
Entity resolution (ER), also known as duplicate detection or record matching, is the problem of identifying the tuples that represent the same real world entity. In this paper, we address the problem of ER for uncertain data, which we call ERUD. We propose two different approaches for the ERUD problem based on two classes of similarity functions, i.e. context-free and context-sensitive. We prop...
Probabilistic Models for Collective Entity Resolution Between Knowledge Graphs
The growing popularity of structured knowledge bases such as knowledge graphs necessitates integrating multiple knowledge sources. A key component of this integration is entity resolution (ER), reconciling instances of a single entity occurring in different knowledge graphs. In contrast to the conventional ER problem setting, we consider the scenario where ER judgments for related entities are ...
Multi-Source Entity Resolution for Genealogical Data
In this chapter we study the application of existing entity resolution (ER) techniques on a real-world multi-source genealogical dataset. Our goal is to identify all persons involved in various notary acts and link them to their birth, marriage and death certificates. We analyze the influence of additional ER features such as name popularity, geographical distance and co-reference information o...
Journal
Journal title: Information Sciences
Year: 2014
ISSN: 0020-0255
DOI: 10.1016/j.ins.2014.02.135